Speaker dependent mapping for low bit rate coding of throat microphone speech

Authors

  • Joseph M. Anand
  • Bayya Yegnanarayana
  • Sanjeev Gupta
  • M. R. Kesheorey
Abstract

Throat microphones (TM), which are robust to background noise, can be used in environments with high levels of background noise. However, speech collected using a TM is perceptually less natural. The objective of this paper is to map the spectral features (represented as cepstral features) of TM speech to those of close speaking microphone (CSM) speech to improve the former’s perceptual quality, and to represent it efficiently for coding. The spectral mapping from TM to CSM speech is done using a multilayer feed-forward neural network, which is trained on features derived from TM and CSM speech. The sequence of estimated CSM spectral features is quantized and coded as a sequence of codebook indices using vector quantization. The sequence of codebook indices, the pitch contour and the energy contour derived from the TM signal are used to store or transmit the TM speech information efficiently. At the receiver, the all-pole system corresponding to the estimated CSM spectral vectors is excited by a synthetic residual to generate the speech signal.
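The vector quantization step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the toy codebook, the two-dimensional feature vectors, the squared Euclidean distance, and the codebook size and frame rate in the bit-rate helper are all assumptions chosen for illustration. The function names `quantize`, `dequantize`, and `spectral_bit_rate` are hypothetical.

```python
import math

def quantize(frames, codebook):
    """Map each feature vector to the index of its nearest codeword.

    Nearest is measured by squared Euclidean distance (an assumption;
    the abstract does not state the distance measure).
    """
    indices = []
    for frame in frames:
        best = min(
            range(len(codebook)),
            key=lambda k: sum((f - c) ** 2 for f, c in zip(frame, codebook[k])),
        )
        indices.append(best)
    return indices

def dequantize(indices, codebook):
    """Recover the quantized feature vectors from codebook indices (receiver side)."""
    return [codebook[i] for i in indices]

def spectral_bit_rate(codebook_size, frames_per_second):
    """Bits per second needed for the index stream alone.

    Pitch and energy contours would need additional bits.
    """
    return math.ceil(math.log2(codebook_size)) * frames_per_second

# Toy data standing in for estimated CSM spectral features (hypothetical values).
codebook = [[0.0, 0.0], [1.0, 1.0], [-1.0, 2.0], [3.0, -1.0]]
frames = [[0.9, 1.2], [-0.8, 1.7], [0.1, -0.2], [2.5, -0.5]]

indices = quantize(frames, codebook)
reconstructed = dequantize(indices, codebook)
print(indices)                       # [1, 2, 0, 3]
print(spectral_bit_rate(1024, 100))  # 1000
```

Only the indices need to be stored or transmitted, since the receiver holds the same codebook. Under the assumed figures of a 1024-entry codebook and 100 frames per second, the index stream alone would cost 1000 bits per second.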

Similar resources

Mapping Speech Spectra from Throat Microphone to Close-Speaking Microphone: A Neural Network Approach

Speech recorded from a throat microphone is robust to the surrounding noise, but sounds unnatural unlike the speech recorded from a close-speaking microphone. This paper addresses the issue of improving the perceptual quality of the throat microphone speech by mapping the speech spectra from the throat microphone to the close-speaking microphone. A neural network model is used to capture the sp...

Compensation of Chann Spectrum Freq

Line Spectrum Frequencies (LSFs) are an effective and efficient representation for low bit-rate (LBR) speech coding. It is also appealing to use LSFs in speech or speaker recognition within a digital communication based system. However, the channel effect on LSFs degrades the recognition performance. This paper attempts to treat the problem of channel effect in the LSF domain so that the recognition...

Speaker-dependent mapping of source and system features for enhancement of throat microphone speech

A throat microphone (TM) produces speech which is perceptually poorer than that produced by a close speaking microphone (CSM). Many attempts at improving the quality of TM speech have been made by mapping the features corresponding to the vocal tract system. These techniques are limited by the methods used to generate the excitation signal. In this paper a method to map the source (excit...

A new statistical excitation mapping for enhancement of throat microphone recordings

In this paper we investigate a new statistical excitation mapping technique to enhance throat-microphone speech using joint analysis of throat- and acoustic-microphone recordings. In a recent study we employed source-filter decomposition to enhance the spectral envelope of the throat-microphone recordings. In the source-filter decomposition framework we observed that the spectral envelope difference ...

Throat microphone signal for speaker recognition

Speaker recognition systems perform better when clean speech signals are used for the task. In the presence of high levels of background noise, speech recorded from a close speaking microphone is degraded, and so is the performance of the speaker recognition system. Use of a transducer held at the throat results in a signal that is clean even in a noisy environment. This paper discusses the...

Publication date: 2009